GARUDA: A System for Large-Scale Mining of Statistically Significant Connected Subgraphs

Authors

  • Satyajit Bhadange
  • Akhil Arora
  • Arnab Bhattacharya
Abstract

Unraveling “interesting” subgraphs corresponding to disease/crime hotspots or characterizing habitation shift patterns is an important graph mining task. With the availability and growth of large-scale real-world graphs, mining for such subgraphs has become a pressing need for graph miners as well as non-technical end-users. In this demo, we present GARUDA, a system capable of mining large-scale graphs for statistically significant subgraphs in a scalable manner, and provide: (1) a detailed description of the various features and user-friendly GUI of GARUDA; (2) a brief description of the system architecture; and (3) a demonstration scenario for the audience. The demonstration showcases one real graph mining task as well as GARUDA's ability to scale to large real graphs, portraying speedups of up to 8–10 times over the state-of-the-art MSCS algorithm.

Similar resources

Fast Hierarchy Construction for Dense Subgraphs

Discovering dense subgraphs and understanding the relations among them is a fundamental problem in graph mining. We want to not only identify dense subgraphs, but also build a hierarchy among them (e.g., larger but sparser subgraphs formed by two smaller dense subgraphs). Peeling algorithms (k-core, k-truss, and nucleus decomposition) have been effective to locate many dense subgraphs. However,...

Statistically significant subgraphs for genome-wide association study

Genome-wide association studies (GWAS) have been widely used for understanding the associations of single-nucleotide polymorphisms (SNPs) with a disease. GWAS data are often combined with known biological networks, and they have been analyzed using graph mining techniques toward a systems understanding of the biological changes caused by the SNPs. To determine which subgraphs are associated with...

Dynamic Modelling of a Compressed Air Energy Storage System in a Grid Connected Photovoltaic Plant

The use of photovoltaic (PV) cells in domestic and industrial applications has grown rapidly in recent years. Constructing PV plants is a very smart measure to produce free electricity at large scale, especially in countries with higher solar irradiation potential. On the other hand, compressed air energy storage (CAES) has already been proposed to be employed for energy storage a...

A Closed Frequent Subgraph Mining Algorithm in Unique Edge Label Graphs

Problems such as closed frequent subset mining, itemset mining, and connected tree mining can be solved in a polynomial delay. However, the problem of mining closed frequent connected subgraphs is a problem that requires an exponential time. In this paper, we present ECE-CloseSG, an algorithm for finding closed frequent unique edge label subgraphs. ECE-CloseSG uses a search space pruning and ap...

Arabesque: A System for Distributed Graph Mining - Extended version

Distributed data processing platforms such as MapReduce and Pregel have substantially simplified the design and deployment of certain classes of distributed graph analytics algorithms. However, these platforms do not represent a good match for distributed graph mining problems, as for example finding frequent subgraphs in a graph. Given an input graph, these problems require exploring a very la...

Journal:
  • PVLDB

Volume 9  Issue 

Pages  -

Publication date 2016